Abstract

Background: Human papillomavirus-positive (HPV+) head and neck squamous cell carcinoma (HNSCC) represents 
a distinct clinical and epidemiological condition compared with HPV-negative (HPV-) HNSCC. To test the possible 
involvement of epigenetic modulation by HPV in HNSCC, we conducted a genome-wide DNA-methylation analysis. 

Methods: Using laser-capture microdissection of 42 formalin-fixed paraffin wax-embedded (FFPE) HNSCCs, we 
generated DNA-methylation profiles of 18 HPV+ and 14 HPV- samples, using Infinium 450 k BeadArray technology. 
Methylation data were validated in two sets of independent HPV+/HPV- HNSCC samples (fresh-frozen samples and 
cell lines) using two independent methods (Infinium 450 k and whole-genome methylated DNA 
immunoprecipitation sequencing (MeDIP-seq)). For the functional analysis, an HPV- HNSCC cell line was transduced 
with lentiviral constructs containing the two HPV oncogenes (E6 and E7), and effects on methylation were assayed 
using the Infinium 450 k technology. 

Results and discussion: Unsupervised clustering over the methylation variable positions (MVPs) with greatest 
variation showed that samples segregated in accordance with HPV status, but also that HPV+ tumors are 
heterogeneous. MVPs were significantly enriched at transcriptional start sites, leading to the identification of a 
candidate CpG island methylator phenotype in a sub-group of the HPV+ tumors. Supervised analysis identified a 
strong preponderance (87%) of MVPs towards hypermethylation in HPV+ HNSCC. Meta-analysis of our HNSCC and 
publicly available methylation data in cervical and lung cancers confirmed the observed DNA-methylation 
signature to be HPV-specific and tissue-independent. Grouping of MVPs into functionally more significant 
differentially methylated regions identified 43 hypermethylated promoter DMRs, including those of three cadherins that 
belong to the Polycomb group target genes. Integration with independent expression data showed strong negative 
correlation, especially for the cadherin gene-family members. Combinatorial ectopic expression of the two HPV 
oncogenes (E6 and E7) in an HPV- HNSCC cell line partially phenocopied the hypermethylation signature seen in 
HPV+ HNSCC tumors, and established E6 as the main viral effector gene. 

Conclusions: Our data establish that archival FFPE tissue is very suitable for this type of methylome analysis, and 
suggest that HPV modulates the HNSCC epigenome through hypermethylation of Polycomb repressive complex 2 
target genes such as cadherins, which are implicated in tumor progression and metastasis. 
Background

Head and neck cancer is the sixth most common cancer 
worldwide, with an incidence of around 600,000 cases 
per year, with rising trends particularly in young people 
[1,2]. Despite recent advances in the treatment and in 
the understanding of its biology, the 5-year survival rate 
of 50% for patients with head and neck cancer has on 
the whole remained largely unchanged for the past three 
decades, with only some advances since the 1990s [3]. 
The most common type of head and neck cancer is 
squamous cell carcinoma (HNSCC). Human papilloma- 
virus (HPV) represents a major independent risk factor 
for HNSCC. HPV is particularly associated with oro- 
pharyngeal carcinoma, of which 20 to 50% test positive 
for the HPV-16 subtype, with expression of the E6 and 
E7 viral oncogenes [4-6]. HPV-positive (HPV+) HNSCC 
represents a distinct molecular, epidemiologic, and clinical 
condition [7,8], and responds better than HPV-negative 
(HPV-) HNSCC to chemotherapy and radiotherapy (82% 
response rate for HPV+ versus 55% for HPV- cases) and 
has a better disease-free and overall survival (95% versus 
62% at 2 years) [9]. Individuals with HPV+ HNSCC 
have a lower rate of second primary tumors, and a 
decreased cumulative incidence of relapse [10,11]. Thus, 
knowledge of a patient's HPV status offers the possibi- 
lity of stratifying such patients for treatment and of elu- 
cidating the mechanisms underlying the virus-associated 
advantage in drug response and survival in HNSCC. 

The causes responsible for the different clinical behavior 
between HPV+ and HPV- tumors remain poorly under- 
stood. Numerous studies comparing gene expression pat- 
terns of HPV+ and HPV- cancers have shown different 
profiles for the two groups [12-16]. It is therefore likely 
that virus-mediated changes in both the genome and epi- 
genome account for this differing clinical behavior. Deep 
exome sequencing of HPV+ and HPV- HNSCC recently 
confirmed mutations in TP53 as a potential genomic stra- 
tifier for HPV status [17,18]. Analysis of the epigenome is 
more complex, and the majority of studies have therefore 
focused on the methylome, because DNA methylation is 
the most accessible epigenetic modification in clinical 
samples [19]. 

Changes in DNA methylation play a key role in malig- 
nant transformation, leading to the silencing of tumor- 
suppressor genes and overexpression of oncogenes [20]. 
In virus-induced cancers, methylation changes have 
been described in both the host [21,22] and viral [23,24] 
methylomes. A recent study [25] comparing two HPV+ 
with two HPV- HNSCC cell lines showed that HPV 
infection is associated with changes in methylation of 
host genes, and led us to embark on a comprehensive 
study of HPV-mediated DNA methylation in HNSCC 
tumors. The study identified five Polycomb repressive 
complex 2 (PRC2) targets among the hypermethylated 
promoters. Polycomb group (PcG) proteins are transcriptional 
repressors, which modify histone tails to reversibly 
suppress genes required for differentiation. These 
proteins play a major role in neoplasia [26], and their 
oncogenic function is associated with a well-established 
role in stem-cell maintenance. Stem-cell PcG targets were 
shown to be 12 times more likely than non-targets to have 
cancer-specific promoter hypermethylation [27-29], sup- 
porting the theory of a stem-cell origin of cancer [30]. 

Based on these findings, we hypothesized that HPV 
modulates the epigenome in HNSCC, and set out to test 
this by comprehensive methylome analysis of HPV+ and 
HPV- primary tumors and cell lines (with the HPV- tissues 
serving as the control group, similar to a variety of 
previous expression studies in the field of HNSCC [14,15]). 
In addition, we aimed to phenocopy any HPV-mediated 
DNA-methylation signature by ectopic expression of HPV 
oncogenes in HPV- HNSCC cell lines. 

Methods 

Ethics approval 

Ethics approval for this study was granted by the ethics 
committee of University College London/University College 
London Hospitals (UCL/UCLH; reference number 
04/Q0505/59), and informed consent was obtained 
where required. 

Patient samples and clinical data 

We obtained 107 archival formalin-fixed paraffin wax- 
embedded tissue (FFPE) oropharyngeal cancer samples 
from the Department of Histopathology (UCLH) and 
tested for HPV status. Of these, 21 HPV+ and 21 HPV- 
age-matched samples were selected for methylation analysis 
(see Additional File 1, Figure S1 for workflow of 
FFPE sample preparation and selection). Histological 
diagnosis was confirmed by an experienced histopathologist, 
and correlated with clinical findings (see Additional 
File 2, Table S1). Furthermore, three fresh-frozen 
(FF) HPV+ and HPV- HNSCC samples (see Additional 
File 2, Table S2) were obtained from the UCLH Head 
and Neck Tumour Bank. 

Assessment of HPV status 

HPV status was determined by CDKN2A (p16) immunostaining 
of the corresponding FFPE blocks, and confirmed 
by E6 quantitative (q)PCR on DNA extracted 
from both FF and FFPE samples. This combination of 
tests has been shown to have 97% sensitivity and 94% 
specificity, and to be the best discriminator of favorable 
outcome (see Additional File 2, Table S3) [31]. 
p16 staining 

p16 staining was performed using a fully automated 
immunohistochemistry staining system (Bond™-III; 
Leica Microsystems, Inc., Buffalo Grove, IL, USA). 



Lechner et al. Genome Medicine 2013, 5:15 
http://genomemedicine.com/content/5/2/15 






Sections 3 µm thick were cut from a total of 82 FFPE 
blocks set from HNSCC samples (see Additional File 1, 
Figure S1), and were prepared for p16 staining. Using 
the staining system, slides were dewaxed (Bond Dewax 
Solution; Leica Microsystems) in accordance with the 
manufacturer's recommendation (protocol '*D'). Antigen 
retrieval was conducted using the accompanying solu- 
tion (Bond ER1; Leica Microsystems) for 30 minutes in 
accordance with the manufacturer's protocol (*H1(30)). 
Staining was then performed using the accompanying 
detection kit (Bond Polymer Refine Kit; Leica Microsys- 
tems) in accordance with the manufacturer's protocol 
(15,8,8), using the pre-diluted p16 antibody clone 
(E6H4™; Roche mtm Laboratories, Heidelberg, Germany), 
and a negative reagent control (CINtec; Ventana 
Medical Systems, Inc., Tucson, AZ, USA). The stained 
slides were examined by two experienced histopathologists, 
and scored as described previously [32]. Subsequently, 
p16-positive areas of HPV-positive tumor 
samples were subjected to laser-capture microdissection 
(so the relative amount of p16-positive tumor cells used 
for testing should be close to 100%), and tumor samples 
showing a mixed staining pattern (n = 8) were excluded 
from further analysis (see Additional File 1, Figure S1). 
E6 qPCR 

DNA from both laser-capture microdissected HPV+ and 
HPV- HNSCC samples was used for E6 qPCR (see Additional 
File 1, Figure S1). E6 qPCR was optimized using 
primers and TaqMan probes (and using glyceraldehyde-3- 
phosphate dehydrogenase (GAPDH) as a housekeeping 
control), to test for the DNA regions of interest. 

Primer and probe sequences have been published previously 
([33]; see Additional File 2, Table S4; see Additional 
File 2, Table S5). DNA was amplified using qPCR 
with 25 µl 2x Buffer A (ABgene, Epsom, Surrey, UK), 
0.3 µmol/l forward primer, 0.3 µmol/l reverse primer, 
and 0.15 µmol/l TaqMan probe, in a total volume of 
50 µl. As controls, a housekeeping gene (GAPDH) and 
a water sample were included in each PCR setup. qPCR 
was performed using a PCR cycler (Realplex Mastercy- 
cler; Eppendorf, Stevenage, UK) applying the following 
qPCR program: denaturation at 95°C for 15 minutes, 
followed by 40 cycles of 95°C for 15 seconds and 60°C 
for 60 seconds, with no extension step. All reactions 
were run in duplicate (two reactions at 1x concentration 
and two reactions with 1:10 dilution). The E6 qPCR is 
specific for HPV type 16, which is found in the vast 
majority of HPV-positive HNSCC specimens. A selected 
number of our samples were also tested for HPV type 
18 and for low-risk HPV types (including HPV type 6 
and HPV type 11, causative agents in laryngeal papillo- 
matosis) by in situ hybridization (analysis was performed 
by UCL Advanced Diagnostics, University College 
London, London, UK). The results on all tested samples 
were negative, and HPV type 16 was the only HPV type 
that was detected. 

Laser-capture microdissection 

Laser-capture microdissection (LCM) was carried out on 
slides (PALM MembraneSlide 1.0 PEN; Zeiss Microima- 
ging, Munich, Germany) using an automated processor 
(PALM Microbeam™ system; Zeiss Microimaging). 
Depending on tumor size and pathology annotation, clus- 
ters of tumor cells were microdissected from one or more 
slides of the same FFPE block. For the HPV+ and HPV- 
samples, only the respective p16-positive and p16-negative 
tumor areas were dissected. The captured cells were estimated 
to contain 80% or more tumor cells. 

DNA extraction 

DNA was extracted using commercial kits from the FF 
tumor samples (QIAamp DNA Blood Mini Kit; Qiagen 
GmbH, Hilden, Germany) and the laser-dissected FFPE 
samples (QIAamp DNA FFPE Tissue Kit; Qiagen 
GmbH). 

Genome-wide methylation analysis 

DNAs were prepared in a total volume of 20 µl (1 µg of FF 
and cell-line DNAs and 2 µg of FFPE DNA per sample) 
using a previously optimized protocol [34], in conjunction 
with two commercial kits (REPLI-g FFPE kit, catalog number 
150243, Qiagen GmbH; and EZ DNA Methylation kit, 
catalog number D5001, Zymo Research Corp, Orange, 
CA, USA). The latter kit was modified to improve bisulfite 
conversion efficiency by inclusion of a cyclic denaturation 
step as described previously [34]. A microarray 
platform (Infinium HumanMethylation450 BeadChips; 
Illumina Inc., San Diego, CA, USA) was used, which was 
processed by the UCL Genomics Core Facility in accor- 
dance with the manufacturer's recommendation. The 
scanned data and image output files were managed with 
GenomeStudio software (version 1.9.0; Illumina). 

R statistical software (version 2.14.0 [35]) was used for 
the subsequent data analysis. Raw data were subjected 
to a stringent quality-control analysis as follows. Sam- 
ples showing reduced coverage were removed, and only 
probes with detection levels above background across all 
samples were kept (detection P < 0.01), resulting in a 
raw data matrix of 439,385 probes and 32 samples (18 
HPV+ and 14 HPV-). This raw data matrix was then 
subjected to a principal component analysis to deter- 
mine the nature of the largest components of variation. 
We used random matrix theory (RMT) to estimate the 
number of significant components of variation [36,37]. 
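
The probe-level filtering step described above can be illustrated with a minimal sketch. It assumes the detection P-values are available as a mapping from probe ID to one value per sample; the function name, the data layout, and the toy values are hypothetical and are not taken from the original analysis, which was carried out in R.

```python
def filter_probes(detection_p, threshold=0.01):
    """Keep probes whose detection P-value is below `threshold` in every sample.

    `detection_p` maps a probe ID to a list of detection P-values,
    one per sample (hypothetical layout, chosen for illustration).
    """
    return {
        probe: pvals
        for probe, pvals in detection_p.items()
        if all(p < threshold for p in pvals)
    }


# Toy input with made-up probe IDs and P-values.
toy = {
    "cg_a": [0.001, 0.0005, 0.002],
    "cg_b": [0.001, 0.2, 0.003],  # fails in the second sample
    "cg_c": [0.009, 0.0001, 0.0],
}
kept = filter_probes(toy)
print(sorted(kept))
```

Requiring the threshold to hold across all samples, rather than in a majority, mirrors the wording of the text and yields a complete data matrix with no missing values.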

The 450 k BeadChips contain two types of probes (type 
1 and 2) which have slightly different profiles. Although 
there have been attempts to normalize for that difference 
[38], we found that both the proposed normalization 
methods and the in-house methods that we developed 
overcorrected the data, leading to worse performance as 
evaluated using a rigorous training-test set partition strat- 
egy. Thus, in our supervised analysis, we treated both 
types of design probes equally, and carried out a posteriori 
testing for a potential skew favoring type 1 probes. 
Although there were only 1,075 type 1 probes among the 
top 2,757, this amounted to an over-enrichment, with an 
odds ratio of 1.48 (P < 1 x 10^-16). However, after correcting 
for differences in CpG density between type 1 and type 2 
probes, the enrichment odds ratio favoring type 1 probes 
was significantly reduced to 1.13 (P = 0.03 approximately). 
Thus, there was no substantial skew favoring type 1 
probes, and we found that normalizing for the design 
using the peak-based correction method of Dedeurwaerder 
only led to overcorrection and increased technical 
variability (see Additional File 1, Figure S2). 
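
For readers unfamiliar with the enrichment measure, the following sketch shows how an odds ratio is obtained from a 2x2 table of probe counts (type 1 versus type 2, within versus outside the top-ranked set). The counts used here are invented for illustration only: the text does not report the number of type 1 probes on the array, so the reported value of 1.48 cannot be reproduced from this example.

```python
def odds_ratio(top_type1, top_type2, rest_type1, rest_type2):
    """Odds ratio for a 2x2 table.

    Odds of being type 1 among top-ranked probes, divided by the
    odds of being type 1 among the remaining probes.
    """
    return (top_type1 * rest_type2) / (top_type2 * rest_type1)


# Hypothetical counts, NOT taken from the study.
example = odds_ratio(top_type1=30, top_type2=20, rest_type1=100, rest_type2=100)
print(example)
```

An odds ratio above 1 indicates over-representation of type 1 probes in the top-ranked set; a value close to 1, as found after correcting for CpG density, indicates little or no skew.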

All normalized and raw 450 k methylation data were 
submitted to the Gene Expression Omnibus (GEO; 
National Center for Biotechnology Information, 
Bethesda, MD, USA) in accordance with the instructions 
provided (GEO accession numbers: GSE38266, 
GSE38268, GSE38270 and GSE38271). 

Hypermethylation signature 

To quantify the strength of the association and to adjust 
for multiple testing, we estimated the false-discovery rate 
(FDR) using the q-value procedure [39]. Because the analytical 
q-value estimates assume independence of the 
underlying tests, which does not necessarily apply to 
neighboring probes that are spatially correlated, we also 
estimated the FDR using a permutation approach that pre- 
serves the potential correlation structure of proximal 
probes. However, empirical and analytical FDR estimates 
were in close agreement (see Additional File 1, Figure S3). 
Both procedures estimated approximately 2,750 methyla- 
tion variable positions (MVPs) at FDR of less than 0.01, 
that is, less than 1% of the 2,757 probes are expected to be 
false positives. Probes from the X and Y chromosomes 
were removed when obtaining the methylation signature. 
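
As an illustration of FDR control, the sketch below implements the Benjamini-Hochberg adjustment. This is a deliberately simpler, related procedure and is not the q-value method [39] or the permutation approach used in this study; the P-values are invented for the example.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up P-values for illustration.
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
adj = benjamini_hochberg(pvals)
significant = [p for p, a in zip(pvals, adj) if a < 0.05]
print(significant)
```

Like the analytical q-value estimate, this adjustment assumes that the tests are independent or positively dependent, which is why a permutation-based estimate is a useful cross-check for spatially correlated probes.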

Copy number variation analysis 

Copy number variation (CNV) analysis was performed on 
the DNA of the three HPV+ and three HPV- FF HNSCC 
samples using a genotyping array (HumanOmni1-Quad 
BeadChip; Illumina). This analysis was required for the 
normalization of the methylated DNA immunoprecipitation 
sequencing (MeDIP-seq) data. CNV data were analyzed 
using GenomeStudio software (Illumina). 

Whole-genome methylation analysis with MeDIP-seq 

DNAs from three HPV+ and three HPV- fresh-frozen 
HNSCC samples were subjected to autoMeDIP-seq as 
previously described [40], using a master mix (NEBNext 
DNA Sample Prep Mastermix; New England Biolabs, 
Beverly, MA, USA) for the library preparation and mag- 
netic beads (MagMeDIP; Diagenode, Liege, Belgium) for 
the immunoprecipitation. Adequate enrichment of the 
methylated DNA fraction (compared with input) was 
quality controlled using qPCR. Following adapter- 
mediated PCR, the library was subjected to size selection 
(300 to 350 bp) using low melting-point agarose gels. 
The excised fraction was quality controlled by qPCR. 
Cluster generation and 36 bp end sequencing were performed 
by the UCL Genomics Core Facility, using a 
genome analyzer (GAIIX; Illumina) in accordance with 
the manufacturer's recommendation. 

The data were analyzed using the MeDUSA pipeline 
[41]. Reads were aligned to the reference genome 
(Human assembly GRCh37) using the alignment soft- 
ware BWA (version 0.5.8) [42], with default parameters. 
Filtering was performed using SAMtools (version 0.1.9) 
[43] to remove erroneously mapped and low-quality 
(score of < 10) reads. Only reads forming a correctly 
aligned pair were kept. A final filtering step removed 
potential PCR artifacts by discarding all but one read- 
pair within groups of non-unique fragments (see Addi- 
tional File 2, Table S6). Read quality was ascertained 
using FastQC [44] and the Bioconductor package MED- 
IPS (version 1.0.0) [45]. Probes from the 450 k Bead- 
Chips located within CpG island regions were isolated, 
and these sites were extended to create 500 bp windows. 
Absolute methylation scores for each of these regions 
were calculated from our MeDIP read files using MED- 
IPS. Methylation scores were calculated for each 
extended probe site using default values. 
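
The extension of probe sites to 500 bp windows can be sketched as follows. The text does not state how the windows were positioned relative to each probe, so centring the window on the probe coordinate is an assumption made here for illustration, as are the function name and the 1-based inclusive coordinate convention.

```python
def probe_window(position, width=500):
    """Return a (start, end) interval of `width` bp around a probe position.

    Assumption: the window is centred on the probe. Coordinates are
    1-based and inclusive, and the start is clamped at 1.
    """
    half = width // 2
    start = max(1, position - half)
    end = start + width - 1
    return start, end


print(probe_window(1000))
print(probe_window(100))
```

Clamping at the chromosome start keeps the window width constant for probes close to position 1; an analogous check against the chromosome length would be needed at the other end.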

All normalized and raw MeDIP-seq data were submitted 
to GEO (NCBI) in accordance with the instructions 
provided (GEO accession number: GSE38263). 

Integration of obtained methylation data with publicly 
available methylation data on cervical cancer and lung 
cancer 

R statistical software v2.15.1 [35] was used for pre-processing 
of data and for classic multidimensional scaling (MDS; 
principal coordinates analysis). MDS was used to visualize 
HPV+ and HPV- HNSCC methylation signatures within 
methylation datasets obtained from an HPV-induced cancer 
type (cervical cancer; GSE32861) and a smoking-induced 
cancer type (lung cancer; GSE30759). In detail, for the lung-cancer 
dataset, 27,578 probe IDs for 59 lung-cancer samples 
(from a total of 118: 59 lung-cancer and 59 adjacent-tissue 
samples) were selected. For the cervical-cancer dataset, 
27,578 probe IDs for 48 cervical-cancer samples (from 63 
samples in total) were selected. The relevant methylation 
data from the processed dataset of 18 HPV+ and 14 HPV- 
HNSCCs were extracted (439,385 probe IDs for 32 
HNSCC samples). This resulted in data matrices 
of 27,300 probe IDs (59 lung-cancer samples), 26,871 
probe IDs (48 cervical-cancer samples), 439,385 probe IDs 
(18 HPV+ HNSCC samples), and 439,385 probe IDs (14 
HPV- HNSCC samples). All the datasets were restricted to 
the common probe IDs (24,145 probe IDs) by the 'intersect' 
function in R. As outlined above, 2,757 MVPs were 
identified (using a Bayesian regularized t-statistics model) 
with an FDR of less than 0.01 in HPV+ HNSCC, com- 
pared with HPV- HNSCC. The respective probe IDs were 
intersected with the common probe IDs identified in each 
dataset (lung cancer, cervical cancer, HPV+ HNSCC, 
HPV- HNSCC). This resulted in 90 common probe IDs 
identified across all the datasets, and represents the selection 
of HPV-associated versus smoking-associated features 
tested. To check whether the same pattern was present in 
cervical cancer versus lung cancer, multidimensional scal- 
ing (MDS) of the samples was created using a Euclidean 
distance measure after scaling all common probe ID fea- 
tures, with the R 'cmdscale' function. A Wilcoxon rank 
sum test was used to test inter-sample distances between 
the evaluated datasets (the set of distances between HPV+ 
HNSCC and cervical-cancer samples was compared with 
the set of distances between HPV+ HNSCC and lung-can- 
cer samples). 
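
The comparison of two sets of inter-sample distances can be illustrated with a minimal sketch of the two-sided Wilcoxon rank sum test. The version below uses mid-ranks for ties and a normal approximation without tie or continuity correction, which is a simplification and may differ from the R implementation used in the study; the distance values are invented.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    Returns the U statistic for `x` and an approximate P-value.
    Ties receive mid-ranks; no tie or continuity correction is applied.
    """
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    n1, n2 = len(x), len(y)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of 1-based ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p


# Hypothetical distances: HPV+ HNSCC to cervical vs. to lung samples.
d_cervical = [1.0, 2.0, 3.0]
d_lung = [4.0, 5.0, 6.0]
u_stat, p_value = rank_sum_test(d_cervical, d_lung)
print(u_stat, round(p_value, 4))
```

A rank-based test is a natural choice here because pairwise distances are not normally distributed; note, however, that distances sharing a sample are not independent, so the resulting P-value is best read as descriptive.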

Cell culture and production of SCC003 clones expressing 
HPV-16 oncogenes 

HNSCC cell lines UPCI:SCC090 (HPV+), UPCI:SCC003 
(HPV-), UPCI:SCC036 (HPV-) and PCI-30 (HPV-) (generous 
gifts from Dr Susanne Gollin and Dr Theresa Whiteside, 
University of Pittsburgh Cancer Institute, Pittsburgh, 
PA, USA), 93VU-147T (HPV+) (generous gift from 
Dr Hans Joenje, VU Medical Center, Netherlands) and 
UM:SCC047 (HPV+) (generous gift from Dr Thomas 
Carey, University of Michigan, Ann Arbor, MI, USA) were 
used. All cell lines (see Additional File 2, Table S7) were 
maintained in DMEM supplemented with 10% FBS and 
penicillin/streptomycin. Clones of SCC003, expressing 
either empty vector control, HPV-16 E7, HPV-16 E6, or 
both HPV-16 E6 and E7, respectively, were generated by 
infection of the SCC003 cell line with retroviruses fol- 
lowed by single-cell cloning as follows. Viruses were pro- 
duced by transfecting human embryonic kidney (HEK) 
293T cells with pLXSN (empty vector or containing the 
HPV-16 E6, E7, or both E6 and E7 cDNAs, kind gifts from Dr 
David Beach), together with pHIT-VSVG and MLV-gag/pol 
(kind gifts from Dr Juan Martin-Serrano) using polyethylenimine 
(Polysciences, Inc., Warrington, PA, USA). 
At 72 hours post-transfection, viruses were harvested by 
removal of the medium and filtration through 0.45 µm 
surfactant-free filters (Nalgene Nunc International Corporation, 
Rochester, NY, USA). The filtered virus stocks 
were either frozen at -80°C or diluted 1:2 in DMEM with 
10% FBS and 8 µg/ml hexadimethrine bromide (Polybrene; 
Sigma-Aldrich, St Louis, MO, USA) to give a final concentration 
of 4 µg/ml, then added to SCC003 cells grown to a 
confluence of 40 to 50%. Following overnight incubation, 
the cells were washed to remove virus, and the medium 
was replaced with DMEM plus 10% FBS. At 48 to 
72 hours post-infection, cells were passaged at a ratio of 
1:5 into selection medium containing 400 µg/ml G418. 
Following death of all mock-infected cells (approx. 1 to 
2 weeks), cells were removed from selection, and plated at 
limiting dilution in 96-well plates to generate single-cell 
clones. HPV-16 E6 and E7 qPCR was conducted as 
described previously [33,46]. To assess E6 and E7 expression 
levels in E6-transduced, E7-transduced, and E6+E7-transduced 
SCC003 cell-line clones (and empty vector 
controls), qPCR was performed on cDNA following 
reverse transcription (Superscript II; Invitrogen Corp., 
Carlsbad, CA, USA) of total RNA purified from cells (miRNeasy 
kit; Qiagen GmbH), in accordance with the manufacturers' 
recommendations. 

RNA extraction 

For this experiment, three biological replicates of six 
HNSCC cell lines (SCC47, SCC90, 93VU, PCI30, 
SCC003, and SCC036), as described above, were grown 
in T75 flasks in DMEM supplemented with 10% FCS 
and 1% penicillin/streptomycin. RNA was then extracted 
(RNeasy Mini kit; Qiagen GmbH) in accordance 
with the manufacturer's recommended protocols, with 
DNase digestion. cDNA was synthesized from 500 ng 
RNA (Superscript II cDNA synthesis kit; Invitrogen) in 
accordance with the manufacturer's instructions. 

Primers were designed for CDH8, PCDH10, DNMT1, 
DNMT3a and DNMT3b using a primer design tool 
(Integrated DNA Technologies, Coralville, IA, USA). 
Sequences are shown in Table 1. 

Master mixes were made up with SYBR Green (Applied 
Biosystems, Foster City, CA, USA), 0.2 µmol/l primers and 
1 µl cDNA in 10 µl reactions, and qPCR was carried out 
on a thermal cycler (Realplex 4 Mastercycler; Eppendorf 
AG, Hamburg, Germany) for 10 minutes at 95°C, followed 
by 40 cycles of 15 seconds at 95°C and 60 seconds at 60°C. 
Relative expression was defined in terms of fold change of 
expression between the cluster of three HPV- cell lines 
relative to three HPV+ cell lines for CDH8 and PCDH10, 
and vice versa for DNMT3a, DNMT3b, and DNMT1 
using the ΔΔCt method on Ct values obtained from qPCR. 
P values were calculated using the two-tailed Student's 
t-test across Ct values for the six cell lines from three 
independent experiments. 
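
The fold-change calculation can be sketched as below, under two assumptions that are not stated in the text for this experiment: that each target Ct is normalized to a reference (housekeeping) gene measured in the same sample, and that amplification efficiency is 100%, so that one cycle corresponds to a two-fold difference. The Ct values are invented.

```python
def fold_change_ddct(ct_target_a, ct_ref_a, ct_target_b, ct_ref_b):
    """Fold change of group A relative to group B by the 2^-ddCt method.

    Each delta-Ct is the target Ct minus the reference-gene Ct in the
    same group; the reference gene is an assumption of this sketch.
    """
    delta_a = ct_target_a - ct_ref_a
    delta_b = ct_target_b - ct_ref_b
    ddct = delta_a - delta_b
    return 2 ** (-ddct)


# Hypothetical mean Ct values: group A reaches threshold 3 cycles earlier.
fold = fold_change_ddct(ct_target_a=24.0, ct_ref_a=18.0,
                        ct_target_b=27.0, ct_ref_b=18.0)
print(fold)
```

Because a lower Ct means more template, a negative ΔΔCt corresponds to higher expression in group A, which is why the exponent carries a minus sign.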

Table 1 Primers used for quantitative PCR.

Primer    Direction    Sequence 5'->3'
DNMT3a    Forward      CTGGGAGGAAGCGCAAG
          Reverse      CCATTGGGTAATAGCTCTGAGG
DNMT3b    Forward      CCCATTCGAGTCCTGTCATTG
          Reverse      TTGATA~TTGCCGTCGTGCTrC
DNMT1     Forward      GAAGTGAAGCCGGTAGAGTG
          Reverse      GGTGCTTTTCCTTGTAATCCTG
CDH8      Forward      CTCTTCACCGACTTACCTACTTG
          Reverse      ATGTTGAACTGCGTCTCCAG
PCDH10    Forward      GACAGTGAACAGGGAGATAGTG
          Reverse      TCAGAAGGGACAAAAGAAGGC

Results 

HPV+ tumors have a distinct DNA-methylation signature 

To investigate whether HPV+ and HPV- tumors have distinct 
epigenetic signatures, we performed genome-wide 
DNA-methylation profiling using the 450 k Illumina Infinium 
platform [47], which allows the methylation state of 
over 480,000 cytosine sites (mostly CpG sites) to be inter- 
rogated. FFPE samples from 21 HPV+ and 21 HPV- 
tumors were analyzed. Raw data were subjected to a strin- 
gent quality-control analysis (Methods). This resulted in a 
raw data matrix of 439,385 probes and 32 samples (18 
HPV+ and 14 HPV-). This raw data matrix was then sub- 
jected to principal component analysis to determine the 
nature of the largest components of variation. Using RMT 
[36,37], we estimated a total of nine significant compo- 
nents of variation, which were mainly correlated with bio- 
logical factors. The first two components correlated with 
HPV status, and confirming this, a scatter plot along these 
showed that samples segregated according to HPV status 
(Figure 1A, B). Importantly, there was no substantial varia- 
tion associated with technical factors, including Sentrix 
position or identification (Figure 1A). 

Because the first two principal components of these data 
corresponded most strongly with HPV status, we naturally 
expected that unsupervised clustering over the most vari- 
able probes would result in segregation of samples accord- 
ing to HPV status. As defined previously [48], such probes 
or CpG sites are referred to as MVPs and as hyper-MVPs 
or hypo-MVPs when directionality towards differential 
hypermethylation or hypomethylation has been ascer- 
tained. Segregation was confirmed by consensus clustering 
of the top 250 MVPs (Figure 1C). 
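
Selection of the most variable probes by the median absolute deviation (MAD), the estimator named in the legends of Figures 1 and 3, can be illustrated as follows. The data layout, function names, and beta values are hypothetical and serve only to show the ranking step that precedes clustering.

```python
import statistics


def mad(values):
    """Median absolute deviation from the median."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def top_variable_probes(beta, k):
    """Return the IDs of the `k` probes with the largest MAD.

    `beta` maps a probe ID to its methylation values across samples
    (hypothetical layout, chosen for illustration).
    """
    scores = {probe: mad(vals) for probe, vals in beta.items()}
    return sorted(scores, key=lambda p: scores[p], reverse=True)[:k]


# Made-up beta values for three probes across four samples.
toy_beta = {
    "cg_a": [0.1, 0.1, 0.9, 0.9],  # strongly bimodal
    "cg_b": [0.5, 0.5, 0.5, 0.5],  # invariant
    "cg_c": [0.2, 0.3, 0.4, 0.5],  # moderately variable
}
top = top_variable_probes(toy_beta, k=2)
print(top)
```

The MAD is less sensitive to single outlying samples than the standard deviation, which makes it a robust choice for ranking probes before unsupervised clustering.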

Next, we performed a supervised analysis to ascertain 
the association between DNA methylation and HPV 
status. To rank probes, we used a Bayesian regularized t- 
statistics model [49], which has been used and validated 
in the context of DNA-methylation data [50]. Consistent 
with the previous unsupervised analysis, a histogram of 
P-values from the supervised analysis showed a clear 
trend towards small significant P-values (Figure 2A). 
Using two alternative procedures (q-values [39] and a 
permutation approach; see Methods), we found 2,757 
MVPs with an FDR of less than 0.01, that is, less 
than 1% of the 2,757 MVPs are expected to be false 
positives. Of these 2,757 MVPs, the overwhelming majority 
(2,408; 87%) were hyper-MVPs in HPV+ samples, 
compared with HPV-, indicating that HPV infection is 
associated with widespread gain of DNA methylation. 
The MVPs indicating differential methylation between 
HPV+ and HPV- samples were independent of gender. 
Indeed, we derived ranked sets of MVPs associated with 
HPV status for the 24 men and 8 women separately, and 
the resulting statistics had high correlation (Figure 2B). 

To investigate if the directional DNA-methylation 
changes were related to the position of the MVPs relative 
to the corresponding genes, we first categorized each 
MVP into one of six gene-feature groups (transcription 
start site (TSS)1500, TSS200, 5' untranslated region 
(UTR), first exon, body, and 3' UTR). We found that 
hyper-MVPs in HPV+ samples were preferentially located 
upstream or near the TSS or in gene bodies but not in 
first exons, whereas hypo-MVPs in HPV+ samples were 
preferentially located in gene bodies (Figure 2C). Taken 
together, these data clearly show that the HPV+ tumor 
samples have a distinct epigenetic signature, which shows 
a significant skew towards hypermethylation. 

HPV+ tumors are heterogeneous, with a candidate CpG island 
methylator phenotype 

The enrichment of hyper-MVPs in HPV+ samples and 
the observation that many of these mapped to CpG 
islands suggested a possible association with CpG island 
methylator phenotype (CIMP) in these samples. To 
investigate this further, we performed consensus cluster- 
ing over the top 1,000 MVPs, a procedure similar to the 
one used previously to discover CIMP phenotypes in 
breast and brain cancer [51,52]. The consensus clustering 
yielded four clusters, which still correlated with HPV sta- 
tus, but also showed heterogeneity within the HPV+ and 
HPV- subtypes (Figure 3). Specifically, we found two 
main sub-groups of HPV+ samples, with subtype 1a exhibiting 
higher methylation levels (Figure 3). This subtype 
was also characterized by higher average methylation 
levels when the MVPs were restricted to CpG islands 
(see Additional File 1, Figure S4), suggestive of CIMP. 
However, and in contrast to the CIMPs reported in 
breast, colon, and brain cancers, there was no evidence of 
a stronger correlated hypermethylation pattern in this 
subtype than in the rest of HPV+ tumors. Interestingly, 
the patient samples in our candidate CIMP cluster 1a all 
had poor outcome, exhibiting significantly shorter survival 
times compared with cluster 1b, which contained 
mostly samples from patients with good outcome (log-rank 
P = 0.001, Figure 3; see Additional File 1, Figure S4). 
It is noteworthy that there was no significant association 
with viral load, relative amount of p16-positive tumor 
cells, or expression of viral transcripts between these two 
sub-groups. 
Figure 1 Unsupervised analysis of the top 250 methylation variable positions (MVPs) in formalin-fixed, paraffin wax-embedded 
(FFPE) human papillomavirus-positive (HPV+) and HPV-negative (HPV-) tumor samples. (A) Singular value decomposition: PC-k 
denotes the kth principal component, DA denotes survival at censoring date. The first two principal components PC-1 and PC-2 most 
strongly correlated with HPV status, whereas the remaining significant components associated with clinical parameters, including alcohol 
consumption, smoking, age, sex, tumor stage, and grade. No association was found with technical factors (such as Sentrix position and 
Sentrix ID) on the array. (B) The first two principal components clearly distinguish HPV status. HPV+ samples are plotted in black, HPV- 
samples are in red, and m/f indicates male/female. (C) Clusters inferred by the unsupervised consensus-clustering algorithm for the top 250 
MVPs as found using the MAD estimator. 
Figure 2 Supervised differential methylation analysis reveals human papillomavirus-positive (HPV+) signature, showing a skew 
towards hypermethylation. (A) Histogram of P-values from the supervised analysis, showing a clear trend towards small significant P-values. (B) 
Independence of HPV status and sex: scatter plot of t-statistics of individual CpGs reflecting HPV status (positive t-statistics indicate 
hypermethylation in HPV-infected samples), with P-value computed using Wilcoxon rank sum test. (C) Methylation status according to gene- 
feature annotation, showing a clear trend towards hypermethylation (P = 0.017). Gene features: TSS1500, TSS200, 5' untranslated region (UTR), 
first exon, gene body, 3' UTR. 

Figure 3 Unsupervised analysis of the top 1,000 MVPs in formalin-fixed, paraffin wax-embedded (FFPE) human papillomavirus-positive
(HPV+) and HPV-negative (HPV-) tumor samples. (A) Consensus clustering identified four sub-groups in HPV+ and HPV- groups,
revealing sub-group 1a as candidate CIMP within the HPV+ group. (B) Clusters inferred by the unsupervised consensus-clustering algorithm for
the top 1,000 MVPs as found using the MAD estimator. 



Validation of the hypermethylation signature in HPV+ tumors 

To validate our findings, we performed 450 k Infinium 
profiling on six independent fresh-frozen (FF) samples (three HPV+
and three HPV-). All six samples passed our quality-control
criteria (see Methods). We applied the same Bayesian
supervised analysis to rank MVPs according to how well
they discriminated the three HPV+ from the three HPV-
samples. The overwhelming majority of MVPs that were
significantly hypermethylated in HPV+ FFPE samples
were also hypermethylated (and many significantly
hypermethylated) in the HPV+ FF samples relative to
HPV- samples. The regularized t-statistics
obtained from the 32 FFPE samples agreed very strongly
with those obtained from the six FF samples
(P = 3 × 10^-35; Figure 4A). A control set of probes,
which did not differ between HPV+ and HPV- FFPE 
samples, also did not correlate with HPV status in the FF 
set (Figure 4B). To further validate our findings and the 
450 k technology, we compared the 450 k methylation 
values for CpG islands with the methylation scores calculated
from MeDIP-seq using the MEDIPS package, testing
the three HPV+ FF and three HPV- FF samples. We
found strong agreement between the two methods (see 
Additional File 1, Figure S5). 

Next, we investigated if the DNA-methylation changes
associated with HPV status were also present in HPV-infected
HNSCC cell lines. HPV t-statistics between the
FFPE and HPV cell-line experiments correlated strongly
(Fisher test, P = 2 × 10^-27; Figure 4C). Changes in absolute
mean beta value (Δβ) between HPV+ and HPV- cell lines
were substantially larger than the changes detected in
FFPE (paired Wilcoxon test, P = 7 × 10^-15), and again larger
than in FF tumor samples (paired Wilcoxon test,
P = 3 × 10^-[…]; see Additional File 1, Figure S6A). In conclusion,
we identified 2,194 consistent hyper-MVPs and 74 consistent
hypo-MVPs across all three experiments (FFPE
HNSCCs, FF HNSCCs, and HNSCC cell lines; Figure 4D).
This confirms our HPV+ hypermethylation signature
obtained from FFPE samples in an independent
set of HNSCC samples, as well as in HPV+ HNSCC
cell lines, and indicates a strong association between the
observed methylation signature and HPV status. Consistent
with the observed hypermethylation phenotype in
HPV+ tumors and cell lines, real-time qPCR analysis
showed increased mRNA expression of both the de novo
DNA methyltransferase DNMT3A (as described previously
[25]) and the maintenance DNA methyltransferase
DNMT1 in HPV+ cell lines, compared with HPV- cell
lines (see Additional File 1, Figure S7).
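
The agreement between t-statistics from two experiments can be illustrated with a sign-concordance test. The following is a minimal sketch, assuming (the text does not spell out the construction) that the Fisher test is applied to a 2 x 2 table counting probes by the sign of their t-statistic in each dataset. The function names and the toy values are hypothetical and are not taken from the study.

```python
from math import comb


def sign_table(t_a, t_b):
    """Count probes by the sign of their t-statistic in two datasets.

    Returns (both_pos, a_only, b_only, neither), where 'pos' means t > 0,
    i.e. hypermethylated in HPV+ samples.
    """
    both = a_only = b_only = neither = 0
    for x, y in zip(t_a, t_b):
        if x > 0 and y > 0:
            both += 1
        elif x > 0:
            a_only += 1
        elif y > 0:
            b_only += 1
        else:
            neither += 1
    return both, a_only, b_only, neither


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact P value for the table [[a, b], [c, d]].

    Probability, with margins fixed, of at least `a` counts in the
    top-left cell (upper tail of the hypergeometric distribution).
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


# Toy t-statistics (hypothetical values, not data from the study).
t_ffpe = [2.1, 3.0, 1.5, -2.2, -1.1, 2.8, -0.7, 1.9]
t_cell = [1.8, 2.5, 0.9, -1.9, -0.4, 3.1, -1.2, 2.2]
table = sign_table(t_ffpe, t_cell)
p_value = fisher_exact_greater(*table)
print(table, p_value)
```

A small P value here indicates that probes hypermethylated in one dataset tend to be hypermethylated in the other more often than expected by chance.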

Ectopic expression of the HPV-16 oncogene E6 partially 
phenocopies the hypermethylation signature 

To functionally validate our HPV+ hypermethylation
signature, we infected an HPV- HNSCC cell line

Figure 4 Validation of human papillomavirus-positive (HPV+) and HPV-negative (HPV-) methylation signature. (A-C) Validation of
consistency of t-statistics between formalin-fixed, paraffin wax-embedded (FFPE) and (A) fresh-frozen (FF) samples, (B) FF control probes and (C)
HPV+ against HPV- cell lines. (D) Heatmap representation of signature of consistent hypermethylated methylation variable positions (hyper-MVPs;
top) and hypomethylated MVPs (hypo-MVPs; bottom) in the Infinium DNA-methylation data. The DNA-methylation (β) values are represented
using a color scale from yellow (low DNA methylation) to blue (high DNA methylation) normalized across each MVP. The HPV+ head and neck 
squamous cell cancer (HNSCC) methylation signature contains 2,194 consistent hyper-MVPs and 74 consistent hypo-MVPs across all three 
datasets. The six HNSCC cell-line samples were run in duplicate. 

with lentiviral vectors containing either or both of the
HPV-16 oncogenes E6 and E7. After confirmation of
ectopic expression of these HPV oncogenes in three
clones of each cell line (see Additional File 2, Table S8),
we performed DNA-methylation profiling on E6-infected,
E6&E7-infected and E7-infected clones relative to empty
vector controls. 

The skew towards hypermethylation (seen in the
described experiment on FFPE HNSCCs) was confirmed
to be highly significant in E6 and E6&E7 clones against
the background probability of hypermethylation (there
was widespread hypermethylation in E6 and E6&E7
clones) (Figure 5A, Monte Carlo, P = 0.007). The distribution
of methylation changes in E6, E6&E7, and E7
clones compared with controls is illustrated (see Additional
File 1, Figure S8). In contrast to E6 and E6&E7, E7
was shown not to contribute to this hypermethylation
signature (comparison of clones infected with E7
with controls; one-sided Wilcoxon, P = 1). The
skew towards hypermethylation was significantly larger
for E6 than for E6&E7, consistent with the lower expression
levels of E6 in E6&E7-co-infected clones compared
with E6-infected clones (see Additional File 2, Table S8).
The results of the entire experiment are summarized in
two graphs (Figure 5A, B). In conclusion, ectopic expression
of E6 (but not E7) in HPV- HNSCC cell-line clones
partially phenocopies the hypermethylation signature 
seen in HPV+ HNSCC tumors. 
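
A Monte Carlo test of a skew against a background probability can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: it assumes that the test statistic is the fraction of signature probes with a positive t-statistic in the clones, and that the null distribution is obtained by drawing random probe sets of the same size from all probes. The function name, parameters, and toy data are hypothetical.

```python
import random


def monte_carlo_skew_p(t_stats, signature_idx, n_draws=999, seed=0):
    """Monte Carlo P value for a hypermethylation skew in a probe signature.

    t_stats: t-statistic per probe (t > 0 means hypermethylated in clones).
    signature_idx: indices of the signature probes.
    The observed hypermethylated fraction in the signature is compared with
    the fraction in random probe sets of equal size (the background).
    """
    k = len(signature_idx)
    if k == 0:
        raise ValueError("signature must contain at least one probe")
    rng = random.Random(seed)

    def frac_hyper(indices):
        return sum(t_stats[i] > 0 for i in indices) / k

    observed = frac_hyper(signature_idx)
    all_idx = range(len(t_stats))
    hits = sum(frac_hyper(rng.sample(all_idx, k)) >= observed
               for _ in range(n_draws))
    # Add-one correction so the estimate is never exactly zero.
    return (1 + hits) / (1 + n_draws)


# Toy data (hypothetical): a background that is half hyper, half hypo,
# and a 50-probe signature that is entirely hypermethylated.
t_clone = [1.0] * 500 + [-1.0] * 500
signature = list(range(50))
p_skew = monte_carlo_skew_p(t_clone, signature)
print(p_skew)
```

With the add-one correction, the smallest attainable P value is 1 / (n_draws + 1), so the number of draws bounds the resolution of the test.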



Meta-analysis of HPV+ and HPV- HNSCC and of publicly 
available methylation data for cervical and lung cancer 

To test further the effect of HPV on DNA methylation,
methylation data obtained from the 18 HPV+ and 14
HPV- HNSCC samples were integrated with publicly
available methylation data on HPV-induced versus
smoking-induced cancer: 48 cervical-cancer samples
[53] and 59 lung-cancer samples [54]. Using a selection
of HPV-associated versus smoking-associated features
(see Methods), identified by comparing HPV+ with
HPV- HNSCC, we applied multidimensional scaling (MDS)
to the datasets using a simple Euclidean distance measure,
and plotted the distances. An overlap of cervical-cancer samples and
HPV+ HNSCC samples was found (Figure 6). The significance
of this observation was further tested using a Wilcoxon
rank sum test on inter-sample distances. When
focusing upon HPV+/HPV- methylation signatures, the
methylation pattern of cervical-cancer samples was
more closely related to the HPV+ signature seen in
HNSCC (P < 2.2 × 10^-16). This suggests that HPV
induces a distinct methylation signature that is independent
of tissue-specific DNA methylation.
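
The comparison of inter-sample distances can be sketched as below. This is a minimal illustration, assuming that the distances from cervical-cancer samples to HPV+ HNSCC samples are compared with their distances to HPV- HNSCC samples; the exact grouping used in the study is described in its Methods and is not reproduced here. The rank sum test uses a normal approximation without tie correction, and all names and toy profiles are hypothetical.

```python
from math import dist, sqrt
from statistics import NormalDist


def cross_distances(group_a, group_b):
    """Euclidean distance between every sample in group_a and group_b."""
    return [dist(a, b) for a in group_a for b in group_b]


def rank_sum_less(x, y):
    """One-sided Wilcoxon rank sum (Mann-Whitney U) test that x < y.

    Returns (U, P), where U counts pairs with x < y (ties count one half)
    and P comes from the normal approximation without tie correction.
    """
    u = sum(1.0 if xi < yj else 0.5 if xi == yj else 0.0
            for xi in x for yj in y)
    n, m = len(x), len(y)
    mean = n * m / 2
    sd = sqrt(n * m * (n + m + 1) / 12)
    return u, 1 - NormalDist().cdf((u - mean) / sd)


# Toy methylation profiles over two features (hypothetical values).
cervical = [(0.8, 0.7), (0.9, 0.8)]
hpv_pos = [(0.8, 0.8), (0.7, 0.7)]
hpv_neg = [(0.2, 0.3), (0.3, 0.2)]

d_pos = cross_distances(cervical, hpv_pos)
d_neg = cross_distances(cervical, hpv_neg)
u_stat, p_dist = rank_sum_less(d_pos, d_neg)
print(u_stat, p_dist)
```

Distances that share a sample are not independent of each other, so a P value obtained this way is best read as indicative.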

Enrichment of PRC2 targets, especially members of the 
cadherin superfamily, within the hypermethylation signature 

To find consistent targets across all the data sets 
(FFPEs, FFs, and HPV+ cell lines), we assigned all of 
the consistent hyper-MVPs and hypo-MVPs identified 

Figure 5 Validation of consistent hypermethylated methylation variable positions (hyper-MVPs) in E6 and E6&E7 infected cell-line
clones. (A) The formalin-fixed, paraffin wax-embedded (FFPE) hyper-MVP signature consistent with E6 (infected with E6 or E6&E7) versus empty
vector controls (Monte Carlo P = 0.007). Volcano plot shows t-statistics of E6 versus empty clones plotted against log10 FFPE P values. (B) Heatmap
representation of consistent hyper-MVPs in clones infected with E6, E6&E7, or E7 and empty vector controls. Yellow indicates relative
hypomethylation in HPV+ samples and blue indicates hypermethylation (MVPs normalized across samples). 

Figure 6 Multidimensional scaling using the four datasets.
These datasets comprised 48 cervical-cancer samples (CERV;
pink), 59 lung-cancer samples (LUNG; purple), 18 human
papillomavirus-positive (HPV+) head and neck squamous cell cancer
(HNSCC) samples (HPV1; light-blue) and 14 HPV-negative (HPV-)
HNSCC samples (HPV0; green), using a selection of HPV-associated
versus smoking-associated features identified by comparing HPV+ 
versus HPV- HNSCC. 



above to genes, and ran a gene-set enrichment analysis.
The hyper-MVPs, which make up the majority (96.7%)
of MVPs, identified several SUZ12 and PRC2 targets
(see Additional File 2, Table S9), including multiple
members of the cadherin superfamily. Indeed, there
was significant enrichment of hyper-MVPs within the
cadherin genes (11 probes in total; Fisher exact test,
P = 4 × 10^-7). This was also the case in the top 1,000
MVPs (Fisher P = 0.0003). We tested the possible biological
effects of these cadherins in three separate
ways, by showing that 1) their methylation state was
sufficient to accurately cluster samples in accordance
with HPV status; 2) they have consistent and significant
hypermethylation across their promoter regions
(see section on differentially methylated regions
(DMRs)); and 3) this promoter hypermethylation is associated
with decreased gene expression in existing data
(see Expression section). Using the k-medoids clustering
algorithm (pam in the R package cluster), we
found that the 11 cadherin-annotated probes within
the top 1,000 MVPs were sufficient to detect HPV status
(84% correctly classified (27/32), Fisher exact test,
P = 0.0002). These 11 MVPs mapped to CpG islands,
shores, or shelves of six cadherin genes (CDH8, CDH15,
PCDH8, PCDH9, PCDH10, and PCDHB3). The remaining
3.3% hypo-MVPs were enriched for two gene sets previously
shown to display upregulation of gene expression
in HPV+ head and neck cancers [16,14]. Among the top
100 hits, we also found consistent enrichment of genes
involved in DNA replication and binding, the mitogen-activated
protein kinase pathway and E2F targets (see
Additional File 2, Table S10).
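
The k-medoids classification step described above can be illustrated with a small sketch. For two clusters and a few dozen samples, the k-medoids objective can be minimized by exhaustive search over all medoid pairs; this differs from the build-and-swap heuristic of PAM, although both target the same objective. The function names and the toy beta values are hypothetical.

```python
from itertools import combinations
from math import dist


def two_medoids(samples):
    """Exhaustive 2-medoid clustering with Euclidean distance.

    Tries every pair of samples as medoids and keeps the pair with the
    smallest total distance from each sample to its nearest medoid.
    Returns ((i, j), labels), with label 0 for medoid i and 1 for medoid j.
    """
    best = None
    for i, j in combinations(range(len(samples)), 2):
        cost = sum(min(dist(s, samples[i]), dist(s, samples[j]))
                   for s in samples)
        if best is None or cost < best[0]:
            best = (cost, i, j)
    _, i, j = best
    labels = [0 if dist(s, samples[i]) <= dist(s, samples[j]) else 1
              for s in samples]
    return (i, j), labels


def accuracy(labels, truth):
    """Fraction correctly classified, allowing for swapped cluster labels."""
    agree = sum(l == t for l, t in zip(labels, truth))
    return max(agree, len(truth) - agree) / len(truth)


# Toy beta values for two probes in six samples (hypothetical values).
samples = [(0.80, 0.90), (0.90, 0.80), (0.85, 0.85),
           (0.10, 0.20), (0.20, 0.10), (0.15, 0.15)]
hpv_status = [1, 1, 1, 0, 0, 0]
medoids, labels = two_medoids(samples)
print(medoids, labels, accuracy(labels, hpv_status))
```

Applied to the reported result, 27 correctly classified samples out of 32 corresponds to an accuracy of 27/32, or about 84%.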

To assess the MVP associations in a more biologically
relevant context, we grouped them into DMRs if at least
three (range three to seven) had correlated differential
methylation levels within the TSS200 promoter region.
TSS200 was chosen because it was the most significantly
positively enriched category of the six tested
(P = 2.4 × 10^[…]; see Additional File 1, Figure S9). Applying this
filter, the 2,194 consistent hyper-MVPs mapped to 906
distinct genes, with 416 having at least three probes in
their respective TSS200 regions. From these 416 genes,
we derived 43 hypermethylated TSS200 DMRs (Δβ >
0.1 across FFPE HNSCCs, FF HNSCCs, and HNSCC
cell lines; all Wilcoxon paired P values < 0.05). A sample-permutation
approach yielded an expected 4.4 false
positives (see Methods). HPV cell lines showed the largest
changes in mean TSS200 hypermethylation, significantly
larger than FFPE (paired Wilcoxon, P = 5 × 10^-[…]),
which showed significant hypermethylation relative
to FF (paired Wilcoxon, P = 1 × 10^-[…]; see Additional
File 1, Figure S6B). Using the same approach for the 74
consistent hypo-MVPs, we derived five hypomethylated
TSS200 DMRs. Exemplar profiles of a hyper-DMR for
CDH8 and a hypo-DMR for MEI1 (Figure 7) highlight
the increasing power to detect MVPs and DMRs depending
on cell-type purity (cell line > laser-capture microdissected
FFPE > FF). All DMRs associated with
cadherin genes had sample-permutation estimated P
values of less than 0.05 (for profiles of these, see Additional
File 1, Figure S10). In summary, we found 43
genes with promoter hypermethylation consistently
across all datasets (permutation FDR 10%), including
multiple cadherin genes and other PRC2 targets. In
addition, we found five genes (SNTB1, CYP7B1, MEI1,
ICA1, and FAM163A) with hypomethylated promoter
DMRs.
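
The grouping of MVPs into promoter DMRs and the permutation-based false discovery rate can be sketched as follows. This is a simplified illustration, assuming that a DMR is called for a gene when at least three TSS200 probes are present and their mean methylation difference exceeds 0.1 in every dataset; the paired Wilcoxon test and the correlation requirement described above are omitted. The record layout, function names, and toy values are hypothetical.

```python
from collections import defaultdict


def call_tss200_dmrs(probes, datasets, min_probes=3, min_delta=0.1):
    """Return genes whose TSS200 probes form a hypermethylated DMR.

    probes: dicts with keys 'gene', 'feature' and 'delta_beta', where
    'delta_beta' maps a dataset name to the HPV+ minus HPV- difference.
    """
    by_gene = defaultdict(list)
    for probe in probes:
        if probe["feature"] == "TSS200":
            by_gene[probe["gene"]].append(probe["delta_beta"])
    dmrs = []
    for gene, deltas in by_gene.items():
        if len(deltas) < min_probes:
            continue
        means = [sum(d[ds] for d in deltas) / len(deltas) for ds in datasets]
        if all(m > min_delta for m in means):
            dmrs.append(gene)
    return sorted(dmrs)


def permutation_fdr(expected_false_positives, n_called):
    """False discovery rate estimated from permuted sample labels."""
    return expected_false_positives / n_called


def probe(gene, feature, ffpe, ff, cell):
    """Build one toy probe record (hypothetical layout)."""
    return {"gene": gene, "feature": feature,
            "delta_beta": {"FFPE": ffpe, "FF": ff, "cell": cell}}


datasets = ["FFPE", "FF", "cell"]
toy_probes = [
    probe("GENE_A", "TSS200", 0.20, 0.15, 0.40),
    probe("GENE_A", "TSS200", 0.25, 0.12, 0.35),
    probe("GENE_A", "TSS200", 0.18, 0.14, 0.45),
    probe("GENE_A", "Body", -0.05, 0.00, -0.10),   # ignored: not TSS200
    probe("GENE_B", "TSS200", 0.30, 0.20, 0.50),   # only two TSS200 probes
    probe("GENE_B", "TSS200", 0.28, 0.22, 0.48),
    probe("GENE_C", "TSS200", 0.20, 0.02, 0.30),   # FF difference too small
    probe("GENE_C", "TSS200", 0.22, 0.03, 0.33),
    probe("GENE_C", "TSS200", 0.19, 0.01, 0.31),
]
called = call_tss200_dmrs(toy_probes, datasets)
fdr = permutation_fdr(4.4, 43)  # values reported in the text
print(called, round(fdr, 3))
```

With the reported 4.4 expected false positives among 43 called DMRs, the estimate is 4.4/43, which is consistent with the stated permutation FDR of about 10%.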

Integration with publicly available gene-expression data 

For additional functional evidence of the effect of DNA-methylation
changes on gene expression, we compared our
methylation differences between HPV+ and HPV- FFPE
tumor samples with publicly available gene-expression data
[14]. The top 500 MVPs mapping to CpG islands were
compared with the differential expression t-statistics of
their associated genes. We found a significant negative correlation
(Fisher test, P = 2 × 10^-18; Figure 8A). A list of
genes with consistent TSS200 DMRs across all datasets 
(FFPE HNSCCs, FF HNSCCs, HNSCC cell lines), and 

Figure 7 Exemplar profiles of a hypermethylated differentially methylated region (hyper-DMR) for CDH8 and a hypomethylated DMR
(hypo-DMR) for MEI1. Comparison of DMR profiles obtained from formalin-fixed, paraffin wax-embedded (FFPE) head and neck squamous cell
cancers (HNSCCs), fresh-frozen (FF) HNSCCs, and HNSCC cell lines. The profiles clearly show that the power to detect methylation variable
positions and differentially methylated regions (DMRs) increases with cell-type purity (cell line > laser-capture microdissected FFPE > FF).
Feature annotation is as provided by BeadChip, and methylation values are color-coded accordingly: TSS1500, orange (1500 bp to 200 bp 
upstream of the transcription start site (TSS)); TSS200, red (200 bp upstream of the TSS); 5' untranslated region (UTR), yellow; gene body, blue; 
CpG islands, black; CpG shores, grey; and CpG shelves, light grey. 



which also exhibited differential gene expression in the 
independent Pyeon gene-expression dataset, is shown in 
Figure 8B. Among these were three cadherin genes (CDH8, 
PCDH10 and PCDHB11). These data are consistent with 
cadherin genes being targets for HPV-mediated hypermethylation
and transcriptional silencing in HNSCC.



CDH8 and PCDH10 are hypermethylated and silenced in
HPV+ HNSCC cell lines 

To confirm that genes with promoters that are differentially
methylated between HPV+ and HPV- cell lines are
also differentially expressed, we carried out qPCR for 
CDH8 and PCDH10. We found that CDH8 and 

Figure 8 Integration of DNA-methylation data with public gene-expression data. (A) DNA methylation correlates with decreased gene
expression: scatter plot of t-statistics between human papillomavirus-positive (HPV+) and HPV-negative (HPV-) formalin-fixed, paraffin wax-embedded
(FFPE) cancer samples (top 500 differentially methylated MVPs restricted to CpG islands) shows significant anti-correlation between
DNA methylation and gene expression. Gene-expression data were taken from Pyeon et al. [14]. (B) List of top 10 anti-correlated targets:
differentially methylated genes in promoter region (TSS200) which also exhibit differential gene expression in the independent Pyeon gene-expression
dataset. Three cadherin genes were found among the top 10 hits (bold).

PCDH10 were significantly overexpressed in our panel
of three HPV- HNSCC cell lines relative to three HPV+
cell lines, correlating with hypermethylation in the latter
(see Additional File 1, Figure S11).

Discussion 

The findings reported here represent the most comprehensive
epigenetic study of a virus-induced cancer to
date, and the first to validate the existence of an HPV-mediated
DNA-methylation signature in HPV+ HNSCC.
Supported by extensive validation using independent
samples and different methods, the signature showed a
clear skew towards hypermethylation, which was most
prominent at promoter regions (defined by TSS200).
However, there was also significant hypomethylation at
gene bodies, which, together with promoter hypermethylation,
is a clear hallmark of gene silencing [55]. It is well
documented, for instance, that hypermethylation of the
promoter region of tumor-suppressor genes plays an
important role in cellular transformation [20], and
indeed, we found consistent hypermethylation (defined
by both hyper-MVPs and hyper-DMRs) in the promoter
regions of such genes, as well as in a candidate CIMP in
the HPV+ samples. CIMPs have been reported for a
number of cancers, including neuroblastoma [56], colon
cancer [57], brain cancer [52], and breast cancer [51].
Except for neuroblastoma and possibly a few other cancers,
CIMP has been associated with a favorable clinical outcome.
HPV+ HNSCC are also associated with a more
favorable outcome [9], but the candidate CIMP found in
the current study was only present in a sub-group of four
HPV+ patients, all of whom had a shorter survival time and
a recorded death. To our knowledge, this is the first time
that a CIMP (albeit a candidate CIMP) has been reported
for HNSCC, representing a second example of CIMP
being associated with potentially less favorable clinical
outcome. Furthermore, we were able to show that our
signature (defined by top 1,000 MVPs) was independent
of gender and predictive for smoking status and length of
survival, confirming previous findings [7,9,58].

The inclusion of multiple sample types (FFPE, FF, and 
cell lines) in the validation part revealed an important 
observation, with direct implications for projects with an 
epigenetic biomarker component, such as ICGC [59], 
IHEC [60], OncoTrack [61] and others. Although FF 
samples have emerged as the gold standard for the genomic
analysis of cancer, our data show that archival FFPE
samples may be superior for certain epigenomic analyses,
particularly when combined with LCM, as illustrated in
Figure 7. The largest differences in DNA-methylation
levels were consistently found in cell lines, followed by
laser-microdissected FFPE, followed by FF. This general
trend was expected because DNA methylation is known
to be cell type-specific, but the evident high level of confounding
cellular heterogeneity (resulting in dilution of
the respective MVP/DMR signals) in carefully biobanked
FF samples is nevertheless noteworthy.

The most interesting finding arising from the gene set
enrichment analysis is the enrichment of numerous
members of the cadherin superfamily, which are targets
of PRC2 and are implicated in many cancers and cancer-specific
processes [62], including epithelial to mesenchymal
transition (EMT), a process by which carcinomas
become invasive and acquire the ability to metastasize
[63]. Notable examples include E-Cadherin (CDH1), T-Cadherin
(CDH13) and Proto-Cadherin 10 (PCDH10),
which are recognized tumor-suppressor genes, and have
been found to be hypermethylated in a number of human
cancers [62]. Among the 49 PRC2 targets (defined by
consistent hyper-MVPs) identified here were 10 genes of
the cadherin superfamily in HPV+ HNSCC, including
CDH8 and CDH13 (both also hypermethylated in cervical
cancer [64]), CDH18, CDH19, CDH23, PCDH10, PCDH15,
PCDHB1, PCDHB4, and PCDHB15. Moreover, the 11
MVPs in six cadherin genes identified among the top 1,000
MVPs by unsupervised clustering analysis of FFPE
HNSCC samples warrant further investigation as potential
biomarkers because they clustered our HPV+ and HPV-
samples according to HPV status with high accuracy.
Although we were able to show that DNA-methylation
analysis is suitable for patient stratification according to
HPV status, it was not more effective than mutation analysis
or immunostaining with p16. Therefore, combinatorial
testing may be the clinically most effective way to stratify
patients for HPV in the future.

We obtained two lines of evidence with respect to functional
support for the identified hypermethylation signature.
First, we were able to partially phenocopy the
signature by ectopic expression of the two HPV oncogenes
E6 and E7 in an HPV- HNSCC cell line. Combinatorial
analysis showed that E6 is the main viral effector gene.
The underlying mechanism remains unknown, and is subject
to future work such as analysis of cross-talk between
E6 and DNA methyltransferases, effect of E6 on TP53, and
number and distribution of viral integration sites into the
host genome and the viral methylome itself. Second, we
integrated publicly available expression data with our
DNA-methylation data [14]. Among the top 10 anti-correlated
(high promoter methylation and low expression)
genes were three of the cadherins, namely CDH8,
PCDH10, and PCDHB11. The inverse scenario (low promoter
methylation and high expression) was also found,
and both are likely to contribute to the different clinical
behavior of HPV+ and HPV- HNSCC with regard to survival
and response to therapy. Linking these two lines of
evidence suggests a possible mechanism whereby HPV
could drive tumor progression by promoting EMT [63]
through epigenetic silencing of cadherins, in addition to
its established role in tumor initiation.

Conclusions 

This work significantly advances our understanding of the
epigenetic dynamics at genomic loci targeted by oncogenic
viruses, as shown here for loci associated with HPV infection
in HNSCC. Based on the previously established
finding that patients with HPV+ HNSCC have a better
prognosis than do patients with HPV- HNSCC, it is
tempting to speculate that this advantage may be partly
epigenetically mediated. Our results certainly implicate
DNA methylation in this process. If confirmed, targeted
reprogramming of the identified HPV-mediated hypermethylation
signature (or parts of it) in patients who suffer
from HPV+ cancer offers a potential translational application
for our findings. Although still at an early experimental
stage, targeted reprogramming has recently been
reported, including in cancer cells [65,66]. In the longer
term, these data will contribute to the identification of
diagnostic and prognostic markers and of putative therapeutic
targets.